A distributed selectivity-driven search strategy for semi-structured data over DHT-based networks

نویسندگان

  • Carmela Comito
  • Domenico Talia
  • Paolo Trunfio
چکیده

Distributed Hash Tables (DHTs) are widely used for indexing and locating many types of resources, including semi-structured data modeled as XML documents. A common distributed strategy to process an XML query over a DHT consists in splitting it into a set of simple path queries, and resolving each of them separately. The traffic generated by this strategy grows with the number of paths in the query. To overcome this drawback, an alternative strategy consists in resolving only the sub-query associated with the most selective path, and then submitting the original query to the nodes in the result set. A first goal of this paper is to provide an analytical and experimental study of the two strategies to assess their relative merits in different scenarios. On the basis of this study, we introduce an Adaptive Path Selection (APS) search technique that resolves an XML query in a distributed way by querying either the most selective path or the whole path set, based on the selectivity of the paths in the query. The effective use of APS requires that the querying nodes know in advance the selectivity of all the paths. Addressing this problem is another goal of the paper, which is achieved through: i) The definition of a space-efficient data structure, the Path Selectivity Table (PST), which given any path, returns an estimate of its selectivity. ii) The definition of an efficient strategy that builds the PST in a distributed way and propagates it to all nodes in the network with logarithmic performance bounds and without redundant messages. Experimental results show that the PST accurately estimates the path selectivity values, and that the traffic generated by the APS algorithm using PST-estimated selectivity values is comparable to that produced by APS assuming to know the real path selectivity values.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Enabling Dynamic Querying over Distributed Hash Tables

Dynamic querying (DQ) is a search technique used in unstructured peer-topeer (P2P) networks to minimize the number of nodes that is necessary to visit to reach the desired number of results. In this paper we introduce the use of the DQ technique in structured P2P networks. In particular, we present a P2P search algorithm, named DQ-DHT (Dynamic Querying over a Distributed Hash Table), to perform...

متن کامل

Implementing Dynamic Querying Search in k-ary DHT-based Overlays

Distributed Hash Tables (DHTs) provide scalable mechanisms for implementing resource discovery services in structured Peer-to-Peer (P2P) networks. However, DHT-based lookups do not support some types of queries which are fundamental in several classes of applications. A way to support arbitrary queries in structured P2P networks is implementing unstructured search techniques on top of DHT-based...

متن کامل

Optimally Efficient Prefix Search and Multicast in Structured P2P Networks

Searching in P2P networks is fundamental to all overlay networks. P2P networks based on Distributed Hash Tables (DHT) are optimized for single key lookups, whereas unstructured networks offer more complex queries at the cost of increased traffic and uncertain success rates. Our Distributed Tree Construction (DTC) approach enables structured P2P networks to perform prefix search, range queries, ...

متن کامل

Dynamic Querying in Structured Peer-to-Peer Networks

Dynamic Querying (DQ) is a technique adopted in unstructured Peer-to-Peer (P2P) networks to minimize the number of peers that is necessary to visit to reach the desired number of results. In this paper we introduce the use of the DQ technique in structured P2P networks. In particular, we present a P2P search algorithm, named DQ-DHT (Dynamic Querying over a Distributed Hash Table), to perform DQ...

متن کامل

Searching Techniques in Peer-to-Peer Networks

This chapter provides a survey of major searching techniques in peer-to-peer (P2P) networks. We first introduce the concept of P2P networks and the methods for classifying different P2P networks. Next, we discuss various searching techniques in unstructured P2P systems, strictly structured P2P systems, and loosely structured P2P systems. The strengths and weaknesses of these techniques are high...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • J. Parallel Distrib. Comput.

دوره 93-94  شماره 

صفحات  -

تاریخ انتشار 2016